Understanding Global Dynamics: Population, Health Systems, Gender Equality
Teammates:
1. Jaynica Nunna 11697960
2. Vitesh Chalicheemala 11689328
3. Mounika Vankayalapati 11714417
4. Supriya Ravilla 11644767
Introduction
Our project examines three essential features of global dynamics: population trends,
healthcare systems, and gender equality.
By analyzing and visualizing data on these areas, we aim to surface insights that help
people make fact-based decisions and encourage positive change in society.
The project also highlights the interconnections among the factors that shape today's
world.
Data Abstraction
Dataset:
We used three datasets from reputable sources such as Kaggle, one of them chosen from the ten datasets offered
to us.
These datasets encompassed a wide range of attributes, including population demographics, healthcare infrastructure,
innovation indices, and gender inequality metrics.
Number of Records:
The combined datasets comprised well over a thousand records in aggregate, with each record representing a specific
country or region.
The exact number of records varied across datasets.
Data Transformation:
Before analysis and visualization, we performed extensive data preprocessing to clean and harmonize the datasets.
This involved tasks such as handling missing values, standardizing units of measurement, and ensuring data
consistency across different sources.
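The harmonization steps described above can be sketched in pandas as follows. This is a minimal illustration, not the project's actual code: the column names (`country`, `physicians_per_10000`), the placeholder country names, and all values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical toy tables standing in for two of the source datasets.
# Names and values are made up for illustration only.
population = pd.DataFrame({
    "country": [" country a", "COUNTRY B ", "Country C"],
    "population": [1_000_000, 2_000_000, 3_000_000],
})
health = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "physicians_per_10000": [5.0, 20.0, None],
})


def normalize_country(names: pd.Series) -> pd.Series:
    """Trim whitespace and title-case names so the join keys match."""
    return names.str.strip().str.title()


# Ensure consistency across sources: same key column name and spelling.
population["country"] = normalize_country(population["country"])
health = health.rename(columns={"Country": "country"})
health["country"] = normalize_country(health["country"])

# Standardize units: physicians per 10,000 people -> per 1,000 people.
health["physicians_per_1000"] = health["physicians_per_10000"] / 10
health = health.drop(columns="physicians_per_10000")

# A left join keeps every population record, even without health data.
combined = population.merge(health, on="country", how="left")
print(combined)
```

Missing values are left as NaN at this stage so that the imputation strategy can be chosen separately.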
Above Workflow Explanation
Data Collection:
We collected data from Kaggle, a popular platform for datasets used in data science.
Initial Visualization:
We used D3.js to visualize the uncleaned dataset for a first look at its structure and possible insights.
Data Cleaning:
We cleaned the data in Python with libraries such as Pandas and NumPy, handling missing values, outliers, and
inconsistencies.
Refined Visualization:
We created new visualizations from the cleaned dataset with Python's Matplotlib or Seaborn, focusing on the
particular variables we needed.
Dashboard Creation:
We built interactive dashboards and reports in Microsoft Power BI, incorporating the refined visualizations to
give a comprehensive view of the analyzed information.
Report Generation:
We wrote detailed reports outlining the analysis, the insights gained, and the conclusions drawn from the
visualizations and the numerical work.
Task Abstraction
Task:
Objective:
To examine and visualize important aspects of the world's dynamics, including population demographics,
healthcare systems, and gender equality indicators.
Actions:
Data Collection: Collect relevant datasets from Kaggle on population demographics, healthcare infrastructure,
innovation indices, and gender inequality indices.
Initial Visualization: Use D3.js to build early views of the raw data for an overview of its structure and the
insights it may offer.
Data Cleaning: Clean the dataset in Python with libraries such as Pandas and NumPy, handling missing values,
outliers, and inconsistencies.
Refined Visualization: Use Matplotlib or Seaborn, Python's data visualization libraries, to build more
sophisticated visualizations from the cleaned dataset, emphasizing specific variables of interest.
Dashboard Creation: Use Microsoft Power BI to create interactive dashboards and reports that integrate the
refined visuals and give a holistic picture of the findings.
Report Generation: Create comprehensive reports that summarize the analytical findings and highlight the insights
gained through visualization and data analysis, supporting informed decision making.
Implementation using Tools:
D3.js:
Description: We used D3.js (Data-Driven Documents) for the initial representation of the unprocessed dataset. It provides a means
of creating interactive graphs and charts that render directly in web browsers.
Usage: D3.js helped us create interactive visualizations such as bar plots, scatter plots, and heatmaps to explore the structure of
the raw data and the patterns in it.
Python:
Description: Python, with libraries such as Pandas, NumPy, Matplotlib, and Seaborn, played a key role in preprocessing, analyzing,
and visualizing the data.
Usage: During data cleaning, we used Pandas and NumPy to impute missing values and to remove outliers and inconsistencies.
We then used the cleaned datasets to plot refined visualizations with Matplotlib and Seaborn, which convey insights through
multiple types of charts and plots.
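As a rough sketch of the cleaning step described above, the snippet below imputes missing values with the column median and drops outliers using the 1.5 x IQR rule. Both choices are assumptions for illustration; the report does not state which imputation or outlier rule was actually used, and the column name and values are made up.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "physicians_per_1000": [1.0, 2.0, np.nan, 3.0, 2.0, 50.0],
})
col = "physicians_per_1000"

# Impute missing values with the median, which is robust to outliers.
df[col] = df[col].fillna(df[col].median())

# Flag outliers with the interquartile-range (IQR) rule.
q1 = df[col].quantile(0.25)
q3 = df[col].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the rows whose value lies inside the IQR fences.
cleaned = df[df[col].between(lower, upper)].reset_index(drop=True)
print(cleaned)
```

Whether dropping outliers is appropriate depends on the variable: for country-level indicators an extreme value may be genuine, so inspecting flagged rows before removing them is usually the safer choice.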
Microsoft Power BI:
Description: Microsoft Power BI served as an all-in-one platform for building interactive dashboards and reports, integrating
the visualizations into a holistic view of the analyzed data.
Usage: Power BI allowed us to create interactive visualizations such as bar charts, line graphs, and maps, and to include them
in interactive dashboards. In addition, Power BI's data modeling capabilities made it easier to define relationships between
the datasets.
Results for Analysis Using D3
This visualization shows the population of the top ten countries in 2023.
Each bar's height corresponds to the population size of a specific country.
By comparing the heights of the bars, we can see which countries have larger
populations than others.
The data used here has not been preprocessed or cleaned.
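The chart itself was built in D3, but the selection behind such a chart can be sketched in pandas. The example assumes "top ten" means the ten largest values of a 2023 population column; the column names and placeholder records are hypothetical.

```python
import pandas as pd

# Placeholder records; country names and values are made up.
records = pd.DataFrame({
    "country": [f"Country {i}" for i in range(1, 16)],
    "population_2023": [i * 1_000_000 for i in range(1, 16)],
})

# nlargest returns the rows with the ten highest values, sorted descending.
top_ten = records.nlargest(10, "population_2023").reset_index(drop=True)
print(top_ten)
```

The resulting ten rows map directly onto the bars of the chart, one row per bar, with the population value driving the bar height.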
Using D3
Here we visualize the Global Innovation Index (GII) of the top ten countries.
Each bar represents a country, and the height of the bar indicates its GII score.
This visualization allows us to compare the innovation levels of different
countries, with taller bars indicating higher innovation scores.
The data used here has not been preprocessed or cleaned.
Data Pre-processing using Python
After preprocessing, we saved all three datasets as CSV files using
pandas.
We then used visualization libraries to create the visuals below.
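Saving the cleaned datasets could look roughly like the sketch below. The dataset names, file names, and contents are assumptions for illustration, and a temporary directory stands in for the project folder.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical cleaned datasets; names and values are placeholders.
datasets = {
    "population": pd.DataFrame(
        {"country": ["Country A", "Country B"], "population": [100, 200]}
    ),
    "health": pd.DataFrame(
        {"country": ["Country A", "Country B"], "physicians_per_1000": [0.5, 2.0]}
    ),
    "gender_inequality": pd.DataFrame(
        {"country": ["Country A", "Country B"], "gii": [0.25, 0.5]}
    ),
}

# A temporary directory keeps the example self-contained.
out_dir = Path(tempfile.mkdtemp())

for name, frame in datasets.items():
    # index=False avoids writing the row index as an extra column.
    frame.to_csv(out_dir / f"{name}_clean.csv", index=False)

# Read one file back to confirm the round trip.
reloaded = pd.read_csv(out_dir / "population_clean.csv")
print(reloaded)
```

Writing with `index=False` matters when the files are later loaded into other tools, since an unnamed index column would otherwise appear as an extra field.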
Relationship between physicians and birth registration
Each point in this scatter plot represents one nation's healthcare system.
The axis for physicians per 1,000 people reflects access to healthcare.
The completeness of birth registration reflects governance and care,
indicating a nation's commitment to its citizens' well-being.
Above, we show the same visual built with D3 before data
preprocessing.
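A scatter plot of this kind can be sketched with Matplotlib as shown below. The column names, axis labels, and data points are assumptions made up for the example, and the non-interactive Agg backend is used so the script runs without a display.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
import pandas as pd

# Made-up points for illustration only.
df = pd.DataFrame({
    "physicians_per_1000": [0.5, 1.0, 2.0, 3.0, 4.0],
    "birth_registration_pct": [60.0, 75.0, 90.0, 98.0, 100.0],
})

fig, ax = plt.subplots(figsize=(6, 4))

# One point per country: x is physician density, y is registration coverage.
ax.scatter(df["physicians_per_1000"], df["birth_registration_pct"])
ax.set_xlabel("Physicians per 1,000 people")
ax.set_ylabel("Completeness of birth registration (%)")
ax.set_title("Physicians vs. birth registration")
fig.tight_layout()

# Save to a temporary location to keep the example self-contained.
out_path = Path(tempfile.mkdtemp()) / "physicians_vs_birth_registration.png"
fig.savefig(out_path, dpi=150)
```

A scatter plot shows association only; a pattern between the two variables does not by itself establish that one drives the other.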
Population health heatmap
This map shows where people live across the world.
Each color represents a level of population density, giving a
simple view of where people are concentrated, from
bustling cities to quieter regions.
Gender inequality index map
This map provides a straightforward view of gender inequality across the
world.
Darker colors indicate higher levels of inequality, while lighter shades
represent areas with less disparity between genders.
It's a clear snapshot of the global landscape of gender equality efforts.
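The "darker means more unequal" encoding can be illustrated with a small color-scale function. This is a sketch under stated assumptions: index values are taken to lie between 0 and 1, and the two endpoint colors are arbitrary choices, not the palette used in the actual map.

```python
def shade_for(value: float, vmin: float = 0.0, vmax: float = 1.0) -> str:
    """Map an index value to a hex color; higher values give darker shades."""
    # Normalize to [0, 1] and clamp values that fall outside the range.
    t = (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)

    light = (222, 235, 247)  # assumed color for the lowest inequality
    dark = (8, 48, 107)      # assumed color for the highest inequality

    # Linear interpolation between the two endpoint colors, per channel.
    rgb = tuple(round(lo + t * (hi - lo)) for lo, hi in zip(light, dark))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


for score in (0.0, 0.5, 1.0):
    print(score, shade_for(score))
```

A single-hue sequential scale like this suits an index with a natural low-to-high ordering, because the reader can compare regions by lightness alone.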
Next, we create more visual insights
in Power BI by embedding the
report here.
Microsoft Power BI
Links
https://raw.githubusercontent.com/viteshc/SDV_5320_Project_Group-4/main/1
https://raw.githubusercontent.com/viteshc/SDV_5320_Project_Group-4/main/2
https://raw.githubusercontent.com/viteshc/SDV_5320_Project_Group-4/main/3
Vizhub link:
https://vizhub.com/JaynicaNunna/b24d5d2b720346fa9c55699098eeac8b?edit=files&file=script.js&tabs=package.json%7Escript.js
Power BI:
https://app.powerbi.com/links/IacAgqpw1U?ctid=70de1992-07c6-480f-a318-a1afcba03983&pbi_source=linkShare
Work Management: Implementation Status Report
Work Completed:
Data Collection:
Description: Gathered relevant datasets from Kaggle.
Responsibility: All team members contributed to identifying and selecting datasets.
Contributions: Equal contribution from all team members.
Initial Visualization (D3.js):
Description: Created preliminary visualizations of the uncleaned dataset using D3.js.
Responsibility: Vitesh Chalicheemala
Contributions: Vitesh contributed 100% to this task.
Data Cleaning (Python):
Description: Cleaned the dataset using Python, Pandas, and NumPy.
Responsibility: Jaynica Nunna, Mounika Vankayalapati
Contributions: Jaynica and Mounika each contributed 50% to this task.
Refined Visualization (Python):
Description: Created refined visualizations based on the cleaned dataset using Python's data visualization libraries.
Responsibility: Supriya Ravilla
Contributions: Supriya contributed 100% to this task.
Dashboard Creation (Microsoft Power BI):
Description: Designed interactive dashboards and reports using Microsoft Power BI.
Responsibility: All team members collaborated on dashboard design and
implementation.
Contributions: Equal contribution from all team members.
Report Generation:
Description: Created detailed reports summarizing analysis findings.
Responsibility: Jaynica Nunna
Contributions: Jaynica contributed 100% to this task.
References
1. Kaggle. (n.d.). Kaggle: Your Home for Data Science. Retrieved from
https://www.kaggle.com/
2. D3.js. (n.d.). D3.js - Data-Driven Documents. Retrieved from https://d3js.org/
3. Python Software Foundation. (n.d.). Python. Retrieved from https://www.python.org/
4. McKinney, W., Perktold, J., Seabold, S., & Wes McKinney. (2020). pandas-dev/pandas:
Pandas 1.2.4. Zenodo. https://doi.org/10.5281/zenodo.4611042
5. Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science &
Engineering, 9(3), 90-95. https://doi.org/10.1109/MCSE.2007.55
6. Microsoft Corporation. (n.d.). Power BI. Retrieved from https://powerbi.microsoft.com/